Simulation-based optimization of Markov reward processes
Similar resources
Simulation-based optimization of Markov reward processes
We propose a simulation-based algorithm for optimizing the average reward in a Markov reward process that depends on a set of parameters. As a special case, the method applies to Markov decision processes where optimization takes place within a parametrized set of policies. The algorithm involves the simulation of a single sample path and can be implemented online. A convergence result with probab...
Simulation-Based Optimization of Markov Reward Processes: Implementation Issues
We consider discrete-time, finite-state-space Markov reward processes which depend on a set of parameters. Previously, we proposed a simulation-based methodology to tune the parameters to optimize the average reward. The resulting algorithms converge with probability 1, but may have a high variance. Here we propose two approaches to reduce the variance, which however introduce a new bias into t...
Gradient-Based Optimization of Markov Reward Processes:
We consider a discrete-time, finite-state Markov reward process that depends on a set of parameters. In earlier work, we proposed a class of (stochastic) gradient descent methods that tune the parameters in order to optimize the average reward, using a single (possibly simulated) sample path of the process of interest. The resulting algorithms can be implemented online, and have the property that...
Simulation-Based Optimization Of Markov Controlled Processes With Unknown Parameters
We consider simulation-based gradient estimation and its use in Markov controlled processes with unknown parameters. We consider a Markov reward process controlled by both a set of tunable parameters and a set of fixed but unknown parameters. We analyze the use of recursive identification procedures and their application to existing simulation-based gradient algorithms. We show that simple modifi...
Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes
We develop four simulation-based algorithms for finite-horizon Markov decision processes. Two of these algorithms are developed for finite state and compact action spaces while the other two are for finite state and finite action spaces. Of the former two, one algorithm uses a linear parameterization for the policy, resulting in reduced memory complexity. Convergence analysis is briefly sketche...
Journal
Journal title: IEEE Transactions on Automatic Control
Year: 2001
ISSN: 0018-9286
DOI: 10.1109/9.905687